Unsupervised Detection and Promotion of Authoritative Domains for Medical Queries in Web Search

Authors

  • Manoj K. Chinnakotla
  • Rupesh K. Mehta
  • Vipul Agrawal
Abstract

Medical or health-related search queries constitute a significant portion of the queries issued on the web every day. For health queries, the authenticity or authoritativeness of search results is of utmost importance, in addition to relevance. So far, research on automatic detection of authoritative sources on the web has mainly focused on (a) link-structure-based approaches and (b) supervised approaches for predicting trustworthiness. However, these approaches have inherent limitations. For example, several content farms and low-quality sites artificially boost their link-based authority rankings by forming a syndicate of highly interlinked domains and content, which is algorithmically hard to detect. Moreover, the number of positively labeled training samples available for learning trustworthiness is limited compared to the size of the web. In this paper, we propose a novel unsupervised approach to detect and promote authoritative domains in the health segment using click-through data. We argue that standard IR metrics such as NDCG are relevance-centric and hence are not suitable for evaluating authority. We propose a new authority-centric evaluation metric based on side-by-side judgment of results. Using real-world search query sets, we evaluate our approach both quantitatively and qualitatively and show that it significantly improves the authoritativeness of results compared to a standard web ranking baseline.

Similar resources

Analysis of users’ query reformulation behavior in Web with regard to Wholistic/analytic cognitive styles, Web experience, and search task type

Background and Aim: The basic aim of the present study is to investigate users’ query reformulation behavior with regard to wholistic-analytic cognitive styles, search task type, and experience variables in using the Web. Method: This study is an applied research using survey method. A total of 321 search queries were submitted by 44 users. Data collection tools were Riding’s Cognitive Style A...

Towards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore

Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...

External Plagiarism Detection based on Human Behaviors in Producing Paraphrases of Sentences in English and Persian Languages

With the advent of the internet and easy access to digital libraries, plagiarism has become a major issue. Applying search engines is one of the plagiarism detection techniques that converts plagiarism patterns to search queries. Generating suitable queries is the heart of this technique and existing methods suffer from lack of producing accurate queries, Precision and Speed of retrieved result...

A new model for phrase search based on weighted minimum displacement

Finding high-quality web pages is one of the most important tasks of search engines. The relevance between the documents found and the query searched depends on the user observation and increases the complexity of ranking algorithms. The other issue is that users often explore just the first 10 to 20 results while millions of pages related to a query may exist. So search engines have to use sui...

Effective browsing of image search results through visual and diverse summarization via clustering

With unprecedented growth in production of digital images and use of multimedia references, requirement of image and subject search has been increased. Systematic processing of this information is a basic prerequisite for effective analysis, organization and management of it. Likewise, large collections of images have been made available on the Web and many search engines have provided the poss...

Publication date: 2014